Comparing direct G2P with G2P followed by accent conversion when determining pronunciations for South African English
Authors
Abstract
It has been shown that techniques known as grapheme-and-phoneme-to-phoneme (GP2P) conversion can be used to derive pronunciations in a poorly-resourced accent, such as South African English, using available pronunciations in better-resourced accents of the same language, such as British and American English. However, if the pronunciation is not available in either accent, it must be obtained using grapheme-to-phoneme (G2P) conversion in either the source or the target accent. The question therefore arises whether it is better to apply G2P in the source accent and then GP2P to obtain the desired pronunciation in the target accent, or to apply G2P directly to the target accent. This study finds that if the source dictionary used has a high G2P accuracy (due to the dictionary’s size, regularity, or both), it is advantageous to generate a pronunciation in the source accent first using G2P, and subsequently convert this pronunciation to the target accent.
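The two strategies compared in the abstract can be sketched as a small Python function. This is only an illustration of the control flow, not the authors' implementation: the function `pronounce`, the flag `use_source_first`, the dictionary lookups that precede the fallback, and the three toy models are all hypothetical names and assumptions introduced here. In practice the models would be trained G2P and GP2P converters.

```python
from typing import Callable, Dict, List

Phonemes = List[str]

def pronounce(
    word: str,
    source_dict: Dict[str, Phonemes],
    target_dict: Dict[str, Phonemes],
    g2p_source: Callable[[str], Phonemes],
    g2p_target: Callable[[str], Phonemes],
    gp2p: Callable[[str, Phonemes], Phonemes],
    use_source_first: bool,
) -> Phonemes:
    """Return a target-accent pronunciation for `word` (hypothetical helper)."""
    # Assumption: a known target-accent entry is used as-is.
    if word in target_dict:
        return target_dict[word]
    # A known source-accent entry is converted with GP2P.
    if word in source_dict:
        return gp2p(word, source_dict[word])
    # The word is in neither dictionary: choose one of the two strategies.
    if use_source_first:
        # Strategy A: G2P in the source accent, then GP2P to the target accent.
        return gp2p(word, g2p_source(word))
    # Strategy B: G2P applied directly in the target accent.
    return g2p_target(word)

# Toy stand-ins for trained models (placeholders, not real converters).
def toy_g2p_source(word: str) -> Phonemes:
    return list(word.upper())

def toy_g2p_target(word: str) -> Phonemes:
    return list(word.lower())

def toy_gp2p(word: str, source_phonemes: Phonemes) -> Phonemes:
    return [p + "'" for p in source_phonemes]

via_source = pronounce("ab", {}, {}, toy_g2p_source, toy_g2p_target, toy_gp2p, True)
direct = pronounce("ab", {}, {}, toy_g2p_source, toy_g2p_target, toy_gp2p, False)
```

With real models, the two strategies could be compared by running both over a held-out set of target-accent pronunciations and measuring the accuracy of each.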
Similar resources
Data-driven phonetic comparison and conversion between South African, British and American English pronunciations
We analyse pronunciations in American, British and South African English pronunciation dictionaries. Three analyses are performed. First the accuracy is determined with which decision tree based grapheme-to-phoneme (G2P) conversion can be applied to each accent. It is found that there is little difference between the accents in this regard. Secondly, pronunciations are compared by performing pai...
Generating multiple-accent pronunciations for TTS using joint sequence model interpolation
Standard grapheme-to-phoneme (G2P) systems are trained using a homogeneous lexicon, for example one associated with a particular accent. In practice, a synthesis system may be required to handle multiple accents. Furthermore, a speaker rarely has a pure accent; accents vary continuously within and between regions of a country. Generating phonetic sequences for each accent is possible, but combi...
Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments
Efficient grapheme-to-phoneme (G2P) conversion models are considered indispensable components to achieve the state-of-the-art performance in modern automatic speech recognition (ASR) and text-to-speech (TTS) systems. The role of these models is to provide such systems with a means to generate accurate pronunciations for unseen words. Recent work in this domain is based on recurrent neural networ...
The Festvox Indic Frontend for Grapheme-to-Phoneme Conversion
Text-to-Speech (TTS) systems convert text into phonetic pronunciations which are then processed by Acoustic Models. TTS frontends typically include text processing, lexical lookup and Grapheme-to-Phoneme (g2p) conversion stages. This paper describes the design and implementation of the Indic frontend, which provides explicit support for many major Indian languages, along with a unified framewor...
Improving LVCSR with hidden conditional random fields for grapheme-to-phoneme conversion
In virtually every state-of-the-art large vocabulary continuous speech recognition (LVCSR) system, grapheme-to-phoneme (G2P) conversion is applied to generalize beyond a fixed set of words given by a background lexicon. The overall performance of the G2P system has a strong effect on the recognition quality. Typically, generative models based on joint-n-grams are used, although some discriminat...